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Abstract 

This paper presents a new approach for channel tracking and parameter estimation in cooperative 
wireless relay networks. We consider a system with multiple relay nodes operating under an amplify 
and forward relay function. We develop a novel algorithm to efficiently solve the challenging problem 
of joint channel tracking and parameters estimation of the Jakes' system model within a mobile wireless 
relay network. This is based on particle Markov chain Monte Carlo (PMCMC) method. In particular, it 
first involves developing a Bayesian state space model, then estimating the associated high dimensional 
posterior using an adaptive Markov chain Monte Carlo (MCMC) sampler relying on a proposal built 
using a Rao-Blackwellised Sequential Monte Carlo (SMC) filter. 

I. Introduction 

The relay channel, first introduced by van der Meulen [1], has recently received considerable attention 
due to its potential in wireless applications. Relaying techniques have the potential to provide spatial 
diversity, improve energy efficiency, and reduce the interference level of wireless channels, see [2], [3] 
and [4]. 

There are a number of issues to be considered when designing a relay network, the more important 
of these include: the topology of the relay network; the number of hops in the relay; the number of 
relays to include in the network; and the type of relaying function to incorporate, in order to optimise 
transmission quality of service requirements. 
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In order to utilise the relay channel, an accurate channel state information (CSI) is required at the 
destination. Periodic insertion of known symbols (pilots) along with the transmitted data is widely used 
for channel estimation. For example, under an Orthogonal Frequency Division Multiplex (OFDM) system, 
some of the subcarriers are dedicated to pilot symbols [5], [6], [7], [8], [9]. 

All current works in the literature concentrate on performing channel estimation in a static environment, 
where the channels are assumed to be constant: in [10], the authors designed two linear estimators, namely 
the Least Squares (LS) and Linear Minimum Mean Square Error (LMMSE)) for Amplify and Forward 
(AF) based relay networks; in [11], an algorithm for LMMSE channel estimation in OFDM based relay 
systems was derived; and in [12], a training based LMMSE channel estimator for time division multiplex 
AF relay networks was proposed. 

The problem of joint channel tracking and parameter estimation for relay wireless links has not been 
addressed. This involves jointly estimating Jakes' model parameters for each link in the relay system, 
which we refer to as static parameters throughout the paper, and the non-linear channel tracking problem. 
This will be performed in a robust statistical estimation framework. The focus will be on channel tracking 
in a dual hop relay network in which the number of parallel relays is arbitrary and the type of relaying 
function can be general. 

The overall channel from the Base Station (BS) to the Mobile Station (MS) via the relay is a cascade 
of two links: the BS-relay link and the relay-MS link. Modeling the individual channels can proceed 
according to Jake's model. This provides a good approximation of the channels dynamics by using a first 
order Gauss-Markov processes [13], [14]. 

Contribution: in this work, we propose a novel statistical relay model to address the problem of 
inference and estimation for channel tracking and parameter estimation. We structure the problem of 
channel tracking such that the overall channel from source to destination is only estimated at the 
destination. The advantage of this approach is that very general relay networks with varying processing 
capabilities can be considered in this system. For example, the popular relay functionality of AF can be 
considered here without any additional computational overhead at the relay nodes, and all the statistical 
estimation and resulting computational effort is centralised at the BS. The statistical signal processing 
methodology we develop to address this problem utilises a novel development of a recent particle Markov 
chain Monte Carlo (PMCMC) algorithm [15]. PMCMC allows one to efficiently sample from very high 
dimensional, strongly correlated multivariate time series models. In the context of channel tracking, these 
distributions are obtained via filtering recursions, and the challenge is to jointly sample the latent process 
and static parameters in order to perform estimation. 
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The motivation of this paper is therefore to provide a system model and estimation procedure for relay 
network channel estimation. The consequences of this are that with improved channel estimates, tasks 
such as detection, synchronisation, power allocation and percoding can be improved. 

The paper is structured as follows: in Section II a stochastic system model is developed and a Bayesian 
inference methodology is provided. In Section III the estimation problem is presented, this involves 
development of the novel PMCMC sampling methodology and comparison to a standard less efficient 
approach is discussed. In Section V a complexity analysis of the proposed algorithms is provided. Section 
VI provides extensive simulation results firstly investigating algorithmic performance, and then providing 
detailed study of channel estimation at different SNR values. Conclusions are provided in Section VII. 

Notation: the notation used throughout this paper will involve: capitals to denote random variables 
and lower case to denote realizations of random variables; bold face to denote vectors and non-bold for 
scalars; super script will be used to refer to the index for a particular relay in the network; sub-script 
will denote discrete time, where h\ : T denotes . . . , hr\ and in the sampling methodology combining 
MCMC and particle filtering we use the following notation [-](j, i) to denote the j-th state of the Markov 
chain for the z-th particle in the particle filter. In addition, we denote the proposed Markov chain state 

by [-mO- 
ii. SYSTEM DESCRIPTION 

The system model considered in this paper is presented in Fig. 1. We consider the case where one 
mobile station is transmitting to a BS via L mobile relay links. For simplicity, we assume that the number 
of relay links L is constant, and that during the entire transmission, the mobile station and the relays 
communicate with the same BS. We consider frequency-flat fading characteristics, for example via the 
use of OFDM modulation. In a typical OFDM system some of the subcarriers are dedicated for use as 
pilot symbols, see [5], [6], We adopt this concept as it allows us to perform the channel tracking on a 
symbol by symbol basis within a frame. The filtering framework we develop updates the current estimate 
of the latent channel states at each symbol. In addition, the channel is assumed to be Rayleigh fading 
under constant velocity following Jake's model [5]. 

A. Bayesian system model 

In this section we introduce the Bayesian system model that we consider for channel tracking and 
parameter inference. In the context of this paper, online estimation refers to a frame by frame estimation 
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procedure. In this model, the channels in the relay network are treated as stochastic, where we do not 
know a priori the realized channel coefficient values. We now present the model assumptions. 

1 ) Model Assumptions: 

i. Assume a wireless relay network with one mobile source node, transmitting symbols in frames of 
length T. 

ii. The relays cannot transmit and receive on the same time slot and on the same frequency band. We 
thus consider a half duplex system model in which the data for a given frame are transmitted via 
a two step procedure. In the first step, the source node broadcasts a frame to all the relay nodes. 
In the second step, the relay nodes transmit the processed frame, termed the relay signals, to the 
destination node in orthogonal fashion, ie. non-interfering channels, see for example [4], [16]. 
In addition, we assume that all channels are independent with a coherence interval larger than the 
duration of the symbol. 

hi. We denote the mobile's angular Doppler frequency, relative to the l-th relay, by com and it is 
assumed to be random unknown and constant throughout a frame . 

iv. We denote the l-th relay angular Doppler frequency by Ur , and it is assumed to be random 
unknown and constant throughout a frame and independent from the mobile, the other relays and 
the base station. 

v. We assume that all channels are flat fading, and our model is general enough to consider both 
scenarios of slow and fast fading. 

vi. The channels are parametrized under Jake's model [17]. The l-th relay channel is modeled as a 
two stage latent stochastic process, in which at time n we denote the realization of the two channel 
stages by and g$. The distribution of each stage of the channel at time n is specified as 

H® ~ CM (0, al) Mobile => Relay channel 

(1) 

G$ ~ CM (0, a 2 g ) Relay =>■ BS channel. 
In addition we assume that the channels are temporally correlated and spatially i.i.d (between 
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relays). This corresponds to the following model assumptions 
e[#«g£')] =0, Vn,k,l,j 
eIhWh®] = 0, Vn,k,l^j 



E 
E 



G®G$] = a 2 h J (u® \k - n\\ , Vn, k, I 



J* T 



o (^°m \k — n\^j J (oo^ \k — n\ \ , Vn, k, I 



(2a) 
(2b) 
(2c) 
(2d) 



where, Jq is the zeroth-order Bessel function of the first kind. For details, see Section II in [18]. 
vii. The received signal at the l-th relay is a random variable given by 



R® = s n H® +W®, I G {!,...,£} 



(3) 



where at time n, Hj\) is the channel coefficient between the transmitter and the l-th relay, s n is 
the transmitted pilot symbol and W® is the unknown noise realization associated with the relay 
receiver. 

viii. The received signals at the destination is a random variable given by 

YM=fM(R$,n)GM + V«\le{l,...,L}, (4) 

where at time n, G$ is the channel coefficient between the l-th relay and the receiver, 
is the memoryless relay processing function (with possibly different functions at each of the relays) 
and vjp is the noise realization associated with the relay receiver. 
ix. All received signals are corrupted by i.i.d. zero-mean additive white complex Gaussian noise 
(AWGN). At the l-th relay the noise corresponding to the n-th transmitted symbol is denoted by 
random variable W® ~ CM (0, <r^). Then at the receiver this is denoted by random variable 
K (0 ~ CM (0, o*). Therefore: 



E 



E 



(m) 



E 



0. 



V (n, k) e {1, . . . , K } , V (I, m) e {1, . . . , L} , n ^ k, I ^ m 



2) Bayesian Model: We begin by specifying the latent dynamic model for the channel coefficients 
used to approximate Jakes' model, as studied in [13] 



G® = p®G® 1 + ^i-(^) 2 nV\ 



(5) 



where T® ~ CM (0, 1), Sty ~ CM (0, 1). 



(0 
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Next we formulate the Bayesian model, by first making precise the posterior parameters of interest in 
our Bayesian system model, a, (3, gi-.r, hi:T- To complete the information required to specify a Bayesian 
model we must present the remaining priors. The parameters for the channel coefficients are modeled as 
unknown a-priori and represent the uncertainty in the system parameters er^, , ojr\ given in eqs. 
(2c-2d). In particular since we consider a\ , a 2 g , Jf\ ujm to be unknown random variables, therefore we 
have that oft> and f3^> are also a priori random variables given by 

0(9 = a \ J fu® \k - n\ \ , Vn, k, I (6a) 
a(9 = CT 2 j Q ^« _ n \\ j Q f u @ \k - n \ \ , Vn, jfe, /. (6b) 

Hence, we specify priors on the parameters «(9 and /3"> as follows, 

a ( 9 ~ Beta (a, 6) , 

(V) 

/3 ( 9 ~ Beta (c, d) , 

where Beta (x; a, 6) = B ^ fc j x a-1 (1 — x)^ 1 , and B(-) is the beta function. 

Furthermore, for simplicity, the same prior is used for all relay nodes. This prior choice is made to reflect 
the physical reality of the model. More specifically, to insure a stationary model, a support of [0, 1] is 
required for the prior. Secondly, it should be possible to insure that the majority of the prior mass is 
located close to the right boundary to insure realistic Doppler offsets scenarios. The choices of a, b and 
c, d reflect a realistic scenario of transmission in which the velocity of both transmitter and relay is 
practically achievable. 

Remark 1 - The Bayesian model presented encompasses the case where the relays are both mobile or 
stationary. In case of stationary relays, the angular Doppler frequency of the £-the relay ,uif , is set to 0. 

Remark 2 - We can now exploit directly the properties of the relay model proposed by considering the 
model assumptions which result in a non-linear state space model formulation for the system. The latent 
state processes corresponding to Jake's channel model at each stage of the relay system are given in eq. 
(5), these provide the state update equations for hi : T and gi : y as well as the unknown model parameters 
cx and (3. The non-linear observation equations, relating the transmitted data to the received data at 
the destination, after propagation through the relay network are given in eq. (4). Note, the observation 
equation also introduces auxiliary variables to the state space model corresponding to the relay noise 
wij. Therefore the problem we address in this paper involves the following marginal posterior for 
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model parameters and channel states 



jp(a,/3,gl:T,hl:T|yi:T) 



n 

1=1 



II p W° l« (0 . f> m , ^ , ^ ) p ( ^° l^i) p \£U 



.n=l 



,(0 



(8) 



where a^ 1:L \ P^ 1:L \g^^ , yf.'^ — a, {3, gi-.T, hi:T, Yi:T- In addition it is worth noting that this 

decomposition of the posterior distribution utilizes directly the relay model structure. This posterior 
model that we aim to estimate is very high dimensional with L(3T + 2) parameters and not tractable for 
standard Bayesian estimation procedures. For this reason we require advanced computational tools such 
as PMCMC that will be presented in Section III. These techniques are specifically designed to efficiently 
allow one one to work with such statistical models in relay networks with multiple relays and multiple 
hops even in high dimensions. 

Corollary 1: As a consequence of the model assumptions, the posterior distribution factorises ac- 
cording to the following independence structure 



i=i 



(9) 



with respect to the number of parallel relay transmission paths. 



Remark 3 - Since the posterior factorises to produce independence between relay transmission paths, 
this enables us to exploit this structure in the design of the estimation framework. In particular, the 
approach that we take which involves PMCMC now admits a natural block factorisation structure in 
which the particle filters per block may be run independently. The details relating to this remark will be 
presented in Section III. This means that we are able to estimate the proposal distribution, in the PMCMC 
algorithm for the channels trajectories, via parallel independent particle filters. As a result of this, the 
variance of the incremental important weights used in the estimation of the proposal can be reduced 
leading to an increased PMCMC acceptance probability. Therefore, under the model and estimation 
procedure proposed, increasing the number of relays will not degrade the solution. These properties 
of the system model are well accepted, see [16]. However, the estimation procedures we present are 
general and can be applied also to solutions with spatial correlation, in which this decomposition no 
longer applies. 
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Corollary 2: For relay network topologies with multiple hops, K, per parallel transmission path I, 
the posterior distribution factorises according to the following independent structure 



p (0, A 1:T |y 1:T ) = f[p (0®, A^ly^) , (10) 



(0 i..(0 

l=i 

where 0® = [#^'°), . . . , Q( l > K >^ corresponds to the unknown static parameters for the l-th path; and 

corresponds to the channel gain at the n-th epoch of the l-th relay path. 



A (0 



a('.0) Al,K) 
i\. n , . . . , j.\ n 



Remark 4 - From this we see how the methodology scales as a function of the number of relay hops 
K. We note that as K increases one should be careful to ensure the proposal for the K channels in the 
l-t\\ relay link for the particle filtering aspect of the PMCMC algorithm is designed to approximate the 
optimal importance distribution for each time n. The reason for this is that as the dimension of the state 
at epoch n for a given particle filter block increases, one must always be careful to reduce the variance 
of the incremental importance sampling weights, therefore improving the acceptance rate of the PMCMC 
algorithm. Therefore, under the model and estimation procedure proposed, increasing the number 
of hops will increase the variance of the solution obtained. The details regarding this remark are 
presented below. 

The relay system statistical model proposed can be summarised by the graphical model structure 
presented in Fig. 2. 

III. Channel Tracking Sampling Framework 

In this section we first discuss the challenges associated with estimating the model and performing 
channel tracking under the Bayesian relay model framework developed in Section EL We begin by 
outlining a common approach to tackle the problem of joint static parameter and latent dynamic process 
estimation. We discuss the complications that arise when using this common approach to solve the channel 
tracking problem. We then present a novel sampling algorithm to overcome these difficulties, based on 
Adaptive MCMC and Particle MCMC, which we term the Adaptive PMCMC (AdPMCMC) algorithm. 

The channel tracking problem involves developing an efficient algorithm to estimate the unknown vector 
of parameters a, (3 and the vector of latent process channels states Hx-.t, Gi-.t- To achieve this we consider 
the augmented Bayesian posterior p (a,/3, gi : T, hi:T, Wi : r|yi:T)> containing auxiliary variables Wi : t 
which we marginalise out numerically in our sampling algorithm to obtain the posterior corresponding 
to (8). Under this posterior distribution, we can find a solution to the channel tracking problem which 
involves obtaining point estimates such as the Maximum a-posteriori MAP (posterior mode), the Minimum 
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Mean Square Error MMSE (posterior mean) and posterior credible intervals for {a, (5, Gv.t, Hi : t}, given 
yi : T- In this paper we refer to efficiency of a sampling algorithm as related to the mixing rate of the 
Markov chain. That is how rapidly the Markov chain can reach the stationary regime of the posterior 
from an arbitrary intialisation. In particular the more efficient the mixing rate of an algorithm, for a 
given computational complexity, the more accurate the estimated posterior quantities will be, therefore 
with more accurate channel estimates, we can then improve resulting estimation challenges such as 
synchronisation, power allocation and percoding. 

A. Standard MCMC-Gibbs sampler approach 

There are several sampling approaches that can be considered for state space models involving unknown 
states and parameters. Details and discussion of standard algorithms in this context can be found in [19], 
[20], [21], [22] and the references therein. Such approaches in the context of state space models are 
typically inefficient. This is because they are susceptible to inefficiencies arising from the very high 
dimension of the problem, especially when correlation is present in the posterior. Therefore, though tech- 
nically valid, in many practical settings this precludes the use of naive sampling strategies from a practical 
computational cost. Typically these approaches involve splitting the high dimensional posterior distribution 
into subblocks of parameters and then running a blockwise Metropolis-Hastings (MH) within Gibbs 
sampling framework. One such approach that we consider to compare to our AdPMCMC methodology is 
based on a basic Markov chain Monte Carlo algorithm involving a Gibbs sampling framework. Basically 
this involves a sampling framework in which the posterior denoted p (a, (3, gi : T, hi:T> wi-.tIyi-.t) is 
sampled by splitting the vector of latent states into k sub-blocks of length r, where kr = T, and each 
iteration of the Markov chain updates each sub-block of the states in either a deterministic or random 
scan until a Markov chain of length J is obtained. This is presented in Algorithm 1, and is just one 
of many possible block structures that could be used. The full conditional posterior distributions are 
typically sampled from via a MH-within-Gibbs sampling framework. 

The simplest approach is to sample the univariate full conditional distributions, ie. k = T blocks of 
size r = 1. However, this sampling scheme will typically result in very slow mixing of the Markov chain 
around the support of the posterior, making this naive MCMC algorithm computationally impractical. 
This is especially problematic in high dimensional target posterior distributions, leading to poor channel 
estimates with high variance. It is well known that to avoid this slow mixing Markov chain setting, one 
must sample from larger blocks of parameters. However, the design of an optimal proposal distribution 
for large blocks of parameters is very complicated. The efficiency of such a naive block Gibbs sampling 
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algorithm is dependent both on the choice of blocking of the posterior parameters, the size of the blocks 
updated at each stage and the sampling mechanism for each block. In general it is a significant challenge in 
practice to design algorithms which are efficient in such block Gibbs settings when correlation is present 
between the parameters of the posterior distribution, as will occur in non-linear state space setting of 
channel tracking. As a result for moderate sized values of r the simple MH-within-Gibbs framework 
can be poorly mixing, due to low acceptance probabilities, even when carefully design proposals are 
implemented. This leads to requirements for very long Markov chain lengths which is not practical. 
We overcome these well known problems of naive MCMC sampling algorithms by developing a novel 
version of the PMCMC methodology [15] utilizing adaptive MCMC and SMC algorithms in a nonstandard 
manner as proposal structures, explained below. In section III-B we explain in detail the properties and 
specification for this non-standard algorithm. The methodological innovation we present in this paper is 
to combine an Adaptive MCMC algorithm within the PMCMC framework with a Rao-Blackwellised SIR 
particle filter allowing us to update (a,/3,gi : T,hi : r,Wi : T) in a single efficient iteration of the Markov 
chain. 

The Markov chain state vector, for the relay model in Section II, is in a very high dimension of 
L(3T + 2) parameters. It is therefore critical to any MCMC mechanism to attempt to approximate the 
optimal proposal distribution. The AdPMCMC sampler we develop achieves this by approximating the 
optimal proposal, through a combination of an Adaptive MCMC and Rao-Blackwellised SIR particle 
filter. This improves the MCMC algorithm significantly. 

B. Advanced Adaptive MCMC within Rao-Blackwellised Particle MCMC (AdPMCMC). 

The aim of this section is to present a novel methodology to perform sampling form the posterior 
distribution given in eq. (8). 

The PMCMC approach we develop works by approximating the marginal Metropolis Hastings algo- 
rithm. It therefore updates at each iteration of the Markov chain, the joint channel and parameter 
vector [a, (3, hi-T, gi:T> Wi : r] in an efficient manner. To achieve this the PMCMC algorithm involves 
approximation of the optimal proposal distribution for the latent states p(hi : T, gi-r, Wi : t|o:, /3, yi-.r) with 
a particle filter estimate on the path space, which can be easily sampled from. In addition, the marginal 
likelihood p(yi : T\cx, 0) which is used in the evaluation of the acceptance probability can not be obtained 
analytically. This is due to the fact that marginalization of the joint likelihood over the path space involves 
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the following integration 

T-l 



p(y 1:T \a, /3) = JJp(y t+ i|yi :t ,o:,/3) 



i=0 




J p(y t+ i |h t+1 , g m , w t+ i , a, (3)p(h t+1 , g m , w t+ i |y 1:t , a, (3) <ihi :T dgi :T (iw 



1:T 



which can not be performed analytically. Note we define y\-Q = 0. It can however also be estimated 
efficiently using the same particle filter used in the proposal estimate. Remarkably the key result of [15] 
was to prove that for the PMCMC algorithm utilizing these two particle filter approximations, no matter 
how many particles are utilised in SMC algorithm for each approximation, the stationary distribution of 
the PMCMC Markov chain is unbiased. In particular for our model, the stationary distribution of our 
PMCMC algorithm still remains the target posterior distribution in (8). 

The Particle MCMC proposal distribution to move from a stat at iteration j to a new state at iteration 
(j + 1) is split into two components. 



The first involves a proposal kernel which is constructed via an adaptive MH scheme and this will be 
used to sample the static parameters a, (3. The second component of the proposal kernel involves the 
sampling of a trajectory for the latent channels and relay noise, gi : T, hi : T, Wi : y. The resulting PMCMC 
approach is presented in Algorithm 2. The remainder of this section will first detail the Adaptive MCMC 
component of the PMCMC proposal mechanism, followed by the filtering component of the proposal. 
This section is completed with the details of the marginal MH acceptance probability for the AdPMCMC 
algorithm. 

The introduction of an adaptive MCMC proposal kernel into the Particle MCMC setting allows the 
Markov chain proposal distribution to adaptively learn the regions in which the marginal posterior 
distribution for the static model parameters has most mass. As such the probability of acceptance under 
such an adaptive proposal will be significantly improved over time. Then for the latent channels processes 
we develop the non-standard PMCMC proposal kernel constructed via a Sequential Monte Carlo algorithm 
which will be based on a Rao-Blackwellised SIR filter [23]. The Rao-Blackellisation is performed via 
a conditional Kalman filter structure. The conditioning is specifically chosen to allow one to work with 
arbitrary numbers of relay hops and also more importantly an arbitrary non-linear relay function. In 
particular, this involves a particle filter for hi : y, wi : t and a conditional Kalman filter proposal for gi : r- 



q ([a, f3, gi :T , hi : T, wi; T ] (j); [a, 0, g 1:T , h 1:T , wi :T ] (j + 1)) 



(11) 



q([a,(3] (j); [a,/3] (j + 1)) p ([h 1:T , g 1:T , w 1:T ] (j + l)|y 1:T) [a,/3] (j + 1)) . 
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In the state space setting, the PMCMC algorithm used to sample from a target distribution (8) proceeds 
by mimicking the marginal MH algorithm [15] and [24] in which the acceptance probability, denoted by 
A, going from state at iteration j of the Markov chain to iteration (j + 1), is given by 

A([a,/3,hi : T,gl:T,Wi :T ] (j); [«,/3,hl:T,gl:T,Wi :T ] (j + 1)) = 

• L pJ*S (j + i)|yi:r) g([q,fl (j + 1); [q.fl U)) \ (12) 

V' P ([ a ,f3}(j)\y 1:T )q([ a ,f3}(jy,[ a ,/3}(j + l)) )■ 
Clearly, achieving this requires one to use a very particular structure for the proposal kernel in the 
MCMC algorithm, given in eq. (11). In particular after substitution of this proposal into the standard MH 
acceptance probability, we obtain the marginalised form given by (12). 

The critical idea in formulating the Particle MCMC algorithm is that the proposal distribution for 
p ([hi t, gi:Tj wi t] (j + 1) \yi-.T, [ a , P] (j + 1)) can be sampled from approximately via a Sequential 
Monte Carlo algorithm (otherwise known as a particle filter). Sequential Monte Carlo, [25], [26], 
[27] refers to a class of algorithms which have become popular due to their algorithmic and theoretical 
properties, especially in filtering problems in which a non-linear or non-Gaussian state space model is 
considered. 

In [15], the PMCMC sampler has been shown to have several theoretical convergence properties. In 
particular, [26] derives convergence in the empirical law of the particles to the true filtering distribution 
at each iteration is bounded as a linear function of time t and the number of particles N. In addition a 
Central Limit Theorem can be obtained, 

|| £aw([h t ,g t ,w t ] - p (h t , gt, w t |y w , [a,/3] (j)) \\< C-, 

(p(y t \a,(3)-p(y t \a,(3))^M{0,a?) , 

where p(y t \a,(3) is the normalising constant of the filtering distribution and of < D jj. These results 
are important as they demonstrate that the complexity of the problem only scales linearly with dimension. 

The other innovation we introduce to the proposal mechanism of the PMCMC algorithm involves devel- 
oping a non-trivial adaptive MCMC approach for the static parameters proposal (q ([a, (3] (j); [a, (3] (j + 1))) 
in (11). There are several classes of adaptive MCMC algorithms, see [28]. The distinguishing feature of 
adaptive MCMC algorithms, compared to standard MCMC, is that they utilise a combination of time 
or state inhomogeneous proposal kernels. Several recent papers proposed theoretical conditions that our 
approach satisfies, ensuring ergodicity of our adaptive algorithms, see [21], [29], [30], [31] and [32]. 

In [28] ergodicity of adaptive MCMC is proved under conditions known as Diminishing Adaptation and 
Bounded Convergence. As in [28] we assume that each fixed kernel in the sequence Q 1 has stationary dis- 
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tribution P (•) which corresponds to the marginal posterior of the static parameters. Define the convergence 
time for kernel Q 7 when starting from state a, (3 as M e (a, (3, 7) = inf{s > 1 : \\Qt (o;,/3; •)— P (•) || < e. 
Under these assumptions, they derive the sufficient conditions; 

• Diminishing Adaptation: lim n _ ! . 00 sup Q , 5 /3 S £;||Qr s+1 (ol,/3, •) — Qr s •) || = in probability. 
Note, T s are random indices. 

• Bounded Convergence: {M e ([a,/3] (j),Fj)}JL is bounded in probability, e > 0. 
which guarantee asymptotic convergence in two senses, 

• Asymptotic convergence: lim.,_ >00 ||,Caw ([a,/3] (j)) — P(cx,f3) \\ = 

• Weak Law of Large Numbers: lim^oo i 2~2l=i 4>{[° l -i P] (*)) = / ^>{ol, f3)P(da, df3) for all bounded 

4>:E^R. 

Algorithmic Choices and Specifications for the AdPMCMC Algorithm 

We present the specific details of the adaptive MH within Particle MCMC algorithm used to sample from 
the posterior on the path space of our latent factors and state space model parameters. 

1) Adaptive MCMC for static parameters a, (3: here we detail the specifics of step 3 in Algorithm 2, 
which involves specification of the proposal distribution in the Particle MCMC algorithm (see eq. (11)). 
The static parameters are updated via an adaptive MH proposal comprised of a mixture of Gaussians. 
One of the mixture components has a covariance structure which is adaptively learnt on-line. The mixture 
proposal distribution for parameters [a,/3] at iteration j of the Markov chain is given by, 



Here, Ej is the current empirical estimate of the covariance between the parameters of a, [3 estimated 
using samples from the PMCMC chain up to time j, and w\ is a mixture proposals weight which we 
set according to the recommendation of [28]. The theoretical motivation for the choices of scale factors 
2.38, 0.1 and dimension d are all provided in [28] and are based on optimality conditions presented in 
[33]. The description is depicted in Algorithm 3. 

2) Rao-Blackwellised SIR filter specifications: the proposal kernel for the unknown channels gu, hi : T 
and auxiliary variables w±-t, denoted by p (gi-.r, i^x-.T, Wi-.T\yi:T, [ot,/3] {j)), involves the SIR particle 
filter in which Rao-Blackwellisation is achieved via a Kalman filter. As such, it is an approximation to 



q([a,/3] (j); [a,0] (j + 1)) = Wl N [a, (3] (j + 1); [a,/3] (j) 



(2-38) 2 ^ 
~d~ ^ 




(13) 



+ (l-w 1 )N\ [a, /3] (j + l)\[a,0\ (j), 
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the optimal proposal. The following decomposition is used 

P(gt»h*,w t |yi:t, [a,0] (j)) =p(gt|h t ,w t ,y 1 .i, [a, (3] (j)) x p(h t ,w t \y 1:t , [a,/3] (j)), (14) 

V v ' V v ' 

Kalman filter Particle filter 

where for each particle in the SIR filter for h t , ~wt, there is a corresponding Kalman filter for g t . This is 
presented in Algorithm 4. 

IV. Cramer-Rao Lower Bound for the Path Space Proposal in PMCMC 

In this section we study the Mean Square Error for estimation of the relay channel estimations after 
integrating out the uncertainty in the static parameters a, (3, for a range of SMC particle counts N. This 
will provide for us an understanding of the accuracy of our methodology AdPMCMC in estimating the true 
underlying process for a given signal to noise ratio. We also derive a recursive expression for the Bayesian 
Cramer-Rao Lower Bound (BCRLB) as a lower bound comparison. We demonstrate how the BCRLB can 
be trivially estimated at no additional computational cost in our model framework, recursively for each 
time step t, via the AdPMCMC algorithm and a modified recursion from [34]. We derive analytically 
these results for any unbiased estimate of the marginal latent processes, gi : T, ^-i-.t, wu, conditional on a 
realisation of the static model parameters, [a, f3] (j). We then show how this can be calculated recursively 
on the path space at each iteration of the PMCMC algorithm for each realized data set, allowing us to 
numerically evaluate 



Xi : T — Xi-.T 

Xl;T — Xl;T 



E 



^ i H 
Xl;T — Xl;T 

Xi;T — Xi;T 

Xi : t - Xi : t 



P (xi:T, yi:T) oc, f3) dxi^rdyi :T dad/3 



H 



p(x.i:T,yi:T\oL,/3)p{a,/3) dx.i: T dyi:Tdad(3 (15) 

H -\ 

\a, (3>p (a, p) dad(3, 



Xi:T — Xi;T 

where xi : t — [hi : T, Ei-.t, Wu]. Note, the Monte Carlo integration involved in the estimation of eq. (15), 
does not require any additional computational complexity, as it is evaluated online at each iteration of 
the PMCMC algorithm for each data set generated. This BCRLB provides a lower bound on the MSE 
matrix for the path space parameters which correspond in our model to the estimation of the relay channel 
models. 

We denote the CRLB by [Fi : t (gi:Tj ^-i-.t, w^)] (j) and [Ft (gt, h t , w*)] (j) will denote the Fisher 
Information Matrix (FIM) for symbol t of the frame in the path space, conditional on the proposed static 
parameters at iteration j of the PMCMC algorithm. 
Cramer-Rao Lower Bound for the Path Space Proposal in PMCMC 

In the Bayesian context we do not require that the estimator of interest, in our case X^jK , be unbiased. 
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However we do require that the model is specified such that the following two conditions hold. 
Condition 1: for static state space model parameters for each relay link I which are within some 
interval € [0,1] and 0(0 E [0,1], the prior models p(a^) and p(f3®) satisfy the conditions 
lim a (o ->op(a®) -> 0; lim^i)^oP(P^) 0; ]un- a w->xP( a ®) ~> and ^ m /3w^\p(P^) ->■ 
Condition 2: The following smoothness properties of the likelihood hold: 

f dp(yi: T \a,(3) 

Under these conditions we may then utilize the results of [34] in which recursive expressions for the 
BCRLB are derived for general non-linear state space models. In particular we consider the recursion in 
time t, for the Bayesian equivalent of the Fisher information matrix on the estimate of X^ MSE , given 
in eq. (21) of the paper. We modify these results to integrate out the posterior uncertainty in the joint 
estimation of the static parameters a, [3 parameterizing the state space models. In particular we derive 
results which can perform this marginalization numerically utilizing the existing AdPMCMC framework. 

The BCRLB provides a lower bound on the MSE matrix for estimation of the path space parameters 
which correspond in our model to the estimation of the latent process states Xx : t- We denote the Fisher 
Information Matrix (FIM), used in the CRLB, on the path space by [F\-t (xi : t)] (j) and marginally by 
[F t (xt)] (j) for time t in the path space, conditional on the proposed static parameters at iteration j 
of the PMCMC algorithm. Here we derive an analytic recursive expression for this quantity based on 
[34]. In some cases we can get analytic solutions and in others, we will resort to AdPMCMC based 
online approximations with a novel estimation based on the particle filter proposal distribution of our 
AdPMCMC algorithm. 

Conditional on the previous Markov chain state [a, [3, Xi : t] (j — 1) and the new sampled Markov chain 
proposal for the static parameters at iteration j, [cx,f3] (j), we obtain the following modified recursive 
expression for the FIM based on eq. (21) in [34]: 

J t (X t )] (j) = [Df^t)] (j) - [D^(%)] U) {[Jt-i(%)] U) + [A H i(xt)] (j))'' (J), 

(16) 

where we obtain the following matrix decompositions of our system model, via Eqs. (34-36) of [34] 
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under the model assumptions of additive Gaussian process and observation noise: 
[Ji($i)] (j) = -E Tv xi {V Xl logp( Xl )} Tl 



[A n x] (i) 



-E 



V Xt _, {V Xt _ 1 logp(x t |x^ 1 )} T l =E{[V Xt _ 1 /(x t _i;a,)3)] [V X( _J (x t _ i; a, 



[A^i] (3) = [A-i] (J) 



-E 



V Xt {V Xt _ 1 logp(x i |x t „ 1 )} T = -E [V Xt _J(x t -i;cx,P)] Q7_\; 



[A 2 ^i] (i) 



-E 



V X( {V Xt logp (x t |x t _i)} J 



+ -E 



V Xt {V Xt logp(y t |xi)} 



T 



QT-i + E { [V Xf (x t ; a, /3)] i? t 1 [V^/i (x t ; a, /3)] T } 



(17) 



where /(xt_i;a,/3) is the state model with process noise covariance Qi and h(xt;a,/3) is the ob- 
servation model with observation noise covariance R t . Next we derive these quantities for each model 
conditional on the previous Markov chain state [a, (3, gi-.r, hi : T, wu] (j — 1) = [a, (3, Xi-t] (j — 1) and 
the new sampled Markov chain proposal for the static parameters at iteration j, [a, (3] (j). 

[D^] (j) = -E [Vx^ {V^ log^x^))^ 











i-[/?](i) 2 




(18) 



(j) = [D&] (,") 



-E 



V Xi _, { V Xt _! log]? (Xt|Xi_!)) 



T 



MC?) 
i-H(i) 2 








Pitt) 



l-[/3](i) 2 " 




(19) 



We obtain the following matrix decompositions of our system model, via Eqs. (34-36) of [34], 



[D?^] (j) = — E [V xt {V Xt logp (x t |x t _!)) 5 



E 



i-l«JW) : 



i PL. 







+ 



Hf±gS 



V Xt {V Xt logp(yi|xf)) J 







(20) 



4 o -4- + i 

In the simulation results Section we evaluate for each iteration of the PMCMC chain, j, the following: 
Trace Yl t = ^ T (g 4 ,h t ,w t )^ (j). We produce distributional estimates of this quantity for the 
J iterations of the Markov chain for different SNRs. 

Remark 5 - In our model framework we get analytic expressions for the BCRLB recursion. However, 
a key point about utilising this recursive evaluation for the FIM matrix is that in the majority cases one 
clearly can not evaluate the required expectations analytically over the joint distribution of the data and 
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latent states. However, since we are constructing a particle filter proposal distribution for the AdPMCMC 
algorithm to target the filtering distribution p (xt|yi : t, we can use this particle estimate to evaluate 

the expectations at each iteration t for each data set. It is important to note that this recursion avoids 
ever calculating the expectations using the entire path space empirical estimate, only requiring marginal 
filter density estimates, which wont suffer from degeneracy as a path space empirical estimate would. 

V. Complexity Analysis 

The complexity analysis of the class of algorithms presented in the previous sections can be studied 
from two perspectives; the most technical of these involves theoretical study of the mixing rate of the 
MCMC algorithms under consideration, the other focus would be on the computational complexity. Here 
we focus on a computational complexity comparison between each of the algorithms. The computational 
cost of each of these algorithms can be split into three parts: the first cost involves constructing and 
sampling from the proposal; the second significant computational cost comes from the evaluation of the 
acceptance probability for the proposed new Markov chain state; and the third is related to the mixing 
rate of the overall MCMC algorithm as affected by the length of the Markov chain required to obtain 
estimators of a desired accuracy. We define the following building blocks for a single MCMC iteration 
and their associated complexity: 

1) Sampling a random variable using exact sampling wO(l) 

2) Likelihood evaluation of JlLi Uti P (vn , , 9n , /4°) ~ TL (C m + C a ) + O (1) 

3) Prior evaluations of nLi Wti P P ($ \g®_ x ) p (g?) p (h?) p (a®) p (/3 W ) » 6TLO (1) 
Based on these building blocks we estimate the overall complexity of the proposed algorithms as follows. 

A. Construction and sampling of PMCMC proposal 

1) Adaptive MCMC component: (Step 3 of Algorithm 2) Complexity (2L x 2L + 2L) O (1) 

2 ) Rao-Blackwellised SIR filter component: (Step 4 of Algorithm 2) 

• Kalman filter component : TLO(l). 
. SIR filter component : 2NLTO (1). 

• Evaluation of marginal likelihood: NTO(l). 

• Sampling SIR filter path space proposal: TV© (1). 

3) Evaluation of AdPMCMC acceptance probability: (Step 7 of Algorithm 2) Complexity (NT + AL) O (1). 
Therefore, the total cost of a single AdPMCMC iteration can be approximated as 

(2L 2 + TL + NT (2L + 2) + N) O (1). 
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B. Construction of MCMC proposal 

The total computational complexity of one iteration of the deterministic scan Gibb sampler, in Algo- 
rithm 1, involves L (3T + 2) parameters and requires updating and accepting each proposed move. This 
produces L (3T + 2) x (2TL + 2) O (1) = (6T 2 L 2 + 10TL + 4L) O (1) operations. 

VI. Results 

The aim of this section is to demonstrate the performance of the AdPMCMC algorithm in performing 
channel tracking and estimation. To do this we separate the analysis into two sub-parts. We begin by 
analysing properties of the Adaptive MCMC and the Rao-Blackwellised particle filter proposal distribution 
in the context of the PMCMC algorithm. This involves analysis of Markov chain paths, the acceptance 
probability and the SIR filter performance. 

These aspects will be studied under three different settings: the dimension of the posterior is increased 
by increasing the length of the frame T; the number of particles in the Rao-Blackwellised SIR filter, N, is 
increased; and the SNR is varied. In addition, we demonstrate the significant improvement in computation 
efficiency that our AdPMCMC algorithm has over MH-within-Gibbs. 

The second part considers the wireless communications analysis. It aims to address the question of 
how well this proposed methodology solves the difficult problem of joint channel tracking and estimation 
by considering the estimated MSE of the channels and the distribution of the MMSE for the parameters 
of the non-linear state space models. 

A. Analysis of AdPMCMC algorithm performance versus T, N and SNR 

In this section we evaluate the performance of joint channel tracking and parameter estimation under 
the AdPMCMC framework. The network topology used in the simulations involved a single relay 
network, K = L = 1. In all simulations we ran a Markov chain of length 50,000 iterations, dis- 
carding the first 15, 000 samples as burnin. We then systematically varied each of the three variables 
(T, A r ,SNR) and assessed the performance of the AdPMCMC algorithm. We took values of N ranging 
from 10, 20, 50, 100, 200, 500, 1, 000 and 5,000. The length of frame considered involved T ranging from 
50, 100 and 200 symbols per frame, leading to posteriors to be sampled from in dimensions 152, 302 
and 602 respectively. Finally, we consider SNR levels from dB through to 25dB which covers a wide 
range of possible operating environments. 

The prior distributions for the parameters a and f3 were identical and specified to have a Beta 
distribution, £e(10,0.6). 
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1 ) Analysis of the number of particles N versus acceptance rate of AdPMCMC algorithm: in this 
section we study the performance of the PMCMC algorithm average acceptance probability as a function 
of the number of particles N. This is meaningful as it allows us to recommend a setting for N. To achieve 
this we consider a = (3 = 0.95 used to generate data corresponding to a flat fading channel model with 
an SNR of 15dB and frame length of T = 100. 

In Fig. 3 we present the average acceptance probability for the AdPMCMC algorithm for SNR=15dB, 
T=100, corresponding to a posterior distribution in 302 dimensions. The key finding of this study is that 
we only require a very small number of particles to obtain accurate estimation and efficient performance in 
our PMCMC algorithm. In particular we see that as expected, when the number of particles increases, the 
average acceptance probability of the joint proposal in the AdPMCMC Markov chain increases. Secondly, 
we note that even for a relatively small number of particles, N = 100 we obtain average acceptance rates 
around 20%. This is a very good indication that it is suitable to work with the Rao-Blackwellised SIR 
filter for our proposal. It is typical in the MCMC literature to tune a proposal mechanism in a standard 
MCMC algorithm to produce acceptance rates between [0.2, 0.5] and in some cases it is provably optimal 
to use 0.234, see [33]. 

2) Analysis of the estimated MMSE versus SNR: in this section we study the impact that the SNR will 
have on the estimation and channel tracking under the AdPMCMC algorithm with N = 100, T = 100, 
a = (3 = 0.95 and the SNR ranging from low, medium and high corresponding to OdB, 15dB and 25dB 
respectively. The sequence of subplots in Figs. 4 and 5 demonstrate the MMSE estimates of the channel 
estimations for a frame obtained from the AdPMCMC algorithm, versus the SNR. 

The first set of subplots demonstrates the improvement of the MMSE estimates of the channel h.\-T 
as the SNR increases. Additionally, we provide the 95% posterior confidence intervals for the MMSE 
estimate on the path space, which demonstrates a reduction in uncertainty as the SNR increases. The 
second set of subplots demonstrate the same quantities for the estimation of channels gi.y under this 
scenario. 

Next we compare the MMSE results of the basic MH-within-Gibbs sampler of Algorithm 1 to the 
MMSE results we obtained for the AdPMCMC. In order to perform a fair comparison with respect to 
algorithmic complexity, detailed in Section V, we set the length of the MH-within-Gibbs Markov chain 
to produce the same computational expense as the AdPMCMC sampler. This produced very poor results 
for the Gibbs sampler, and so, instead we present here an increase of computational cost of the Gibbs 
sampler by roughly 100 times the length of the Markov chain. In Fig. 6 we present the MMSE estimates 
for the path space parameters hi : r since gi : r- These results were obtained using the MH-within-Gibbs 
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sampler with identical Markov chain initialisation to that used in the AdPMCMC. In addition, we pre- 
tuned the MH-within-Gibbs sampler to have an acceptance rate of approximately 20%. Clearly, the poor 
mixing rate of the basic Gibbs sampler results in sub-optimal performance, requiring significantly longer 
chain to obtain the same accuracy as the AdPMCMC sampler. As a result, the proceeding analysis will 
continue with the AdPMCMC algorithm. 

Finally, we conclude this section with an analysis of the sample paths for the AdPMCMC Markov 
chain for the model parameters a and ft. These are again presented as a function of the SNR level after 
having been obtained via the adaptive MH proposal mechanism designed to sample these components 
of the state vector at each iteration of the Markov chain. The results are presented in Figs. 7 and 8 
for a and j3 respectively. We see clearly that the precision of the posterior and therefore the estimation 
accuracy for a is strongly affected by the SNR level. This was not found to be the case for our model 
when considering j3, clearly under the postulated model the posterior precision is not strongly affected by 
the SNR level. In addition we note that the MMSE estimates of the parameters for a and j3 was highly 
accurate at around 0.95 for all SNR. This is strong support of our proposed AdPMCMC algorithm for 
performing the joint estimation and channel tracking efficiently. 

B. Analysis of estimated MSE for the AdPMCMC algorithm channel estimates versus SNR 

In this section we consider an experiment in which the SNR is varied from OdB to 25dB in increments 
of 5dB. For each SNR level, 500 frames of data transmission are generated in which each frame is of 
length T = 100. The number of relays present is set to L = 1 and the true values of a and (3 used to 
generate the observations for each frame were equal at 0.95. For each data transmission the AdPMCMC 
algorithm is performed for 25,000 iterations, with the first 10,000 samples discarded as burnin. Then using 
the remaining Markov chain paths we obtained an estimate of the MMSE for the channels hi-x, gi : T for 
each of the transmitted frames. We then obtain the total MSE for each frame and plot the box-whisker plot 
of the distribution of the MSE as a function of SNR for the transmitted frames, under our AdPMCMC 
estimation approach. The results of this analysis are presented in Fig. 9. Clearly, there is a decrease in the 
Total MSE as the SNR increases. In addition we note that as expected, since the estimate of the channels 
gi-T is performed using the Rao-Blackwellising conditionally optimal Kalman Filter, this is reflected in 
the level of the total MSE. The estimates for g\ : T are clearly more accurate than those for the particle 
filter sampled estimates hi-T- 

In addition we present the distribution of the estimated MMSE for the parameters a and /3 over the 
500 frames as a function of the SNR. Clearly as the SNR increases, the estimates converge to the true 
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parameter values and in addition the precision in the MMSE estimates over each independent frame, 
reduces. This clearly appears in Fig. 10. 

In Fig. 11 we present the path space estimate for the BCRLB after marginalizing out the model 
parameters according to eq. (15). This is presented for an SNR=15dB after simulation of the PMCMC 
algorithm for 500 independently generated data sets, for PMCMC chains each of length J=25,000 
iterations. We also present the estimated average MSE for the MMSE estimation of h± : T and g\-x 
and three standard deviations for this average MSE estimate at each step in the path space for these 
simulations. For each data set the MSE is estimated by splitting the Markov chain into subblocks of 
length 1,000 samples on which the MMSE is estimated. This estimate is then used to form an estimate 
of the MSE for the chain corresponding to the particular data set, finally these results are averaged over 
the data sets. The results demonstrate that the estimate of BCRLB is always close to the mean and easily 
within 3 standard deviations, indicating again that our PMCMC methodology and associated channel 
estimates perform close to optimally. 

VII. Conclusions 

We introduced a new approach for channel tracking and online parameter estimation in cooperative 
wireless relay networks. We developed a novel algorithm to efficiently solve the problem of joint 
channel tracking and parameters estimation within a mobile wireless relay network. This is based on 
a novel version of the PMCMC sampler. Simulation results demonstrate the effectiveness of the proposed 
algorithm in very high dimensional state-space. Though not the topic of this paper, we note that the 
proposed estimation methods can be trivially extended to also cover frequency selective channels. 

References 

[1] E. Van Der Meulen, "Three-terminal communication channels," Advances in Applied Probability, vol. 3, no. 1, pp. 120-154, 
1971. 

[2] A. Nosratinia, T. Hunter, and A. Hedayat, "Cooperative communication in wireless networks," IEEE Communications 

Magazine, vol. 42, no. 10, pp. 74-80, 2004. 
[3] J. Laneman, D. Tse, and G. Wornell, "Cooperative diversity in wireless networks: Efficient protocols and outage behavior," 

Information Theory, IEEE Transactions on, vol. 50, no. 12, pp. 3062-3080, 2004. 
[4] J. Laneman and G. Wornell, "Distributed space-time-coded protocols for exploiting cooperative diversity in wireless 

networks," Information Theory, IEEE Transactions on, vol. 49, no. 10, pp. 2415-2425, 2003. 
[5] L. Tong, B. Sadler, and M. Dong, "Pilot-assisted wireless transmissions: general model, design criteria, and signal 

processing," Signal Processing Magazine, IEEE, vol. 21, no. 6, pp. 12-25, 2004. 
[6] S. Coleri, M. Ergen, A. Puri, and A. Bahai, "Channel estimation techniques based on pilot arrangement in OFDM systems," 

Broadcasting, IEEE Transactions on, vol. 48, no. 3, pp. 223-229, 2002. 



November 25, 2010 



DRAFT 



22 



[7] Y. Li, "Pilot-symbol-aided channel estimation for OFDM in wireless systems," Vehicular Technology, IEEE Transactions 

on, vol. 49, no. 4, pp. 1207-1215, 2000. 
[8] F. Tufvesson and T. Maseng, "Pilot assisted channel estimation for OFDM in mobile cellularsystems," vol. 3, 1997. 
[9] M. Morelli and U. Mengali, "A comparison of pilot-aided channel estimation methods for OFDMsystems," Signal 

Processing, IEEE Transactions on [see also Acoustics, Speech, and Signal Processing, IEEE Transactions on], vol. 49, 

no. 12, pp. 3065-3073, 2001. 

[10] F. Gao, T. Cui, and A. Nallanathan, "On channel estimation and optimal training design for amplify and forward relay 

networks," IEEE Transactions on Wireless Communications, vol. 7, no. 5, pp. 1907-1916, 2008. 
[11] K. Yan, S. Ding, Y. Qiu, Y. Wang, and H. Liu, "A low-complexity LMMSE channel estimation method for OFDM-based 

cooperative diversity systems with multiple amplify-and-forward relays," EURASIP Journal on Wireless Communications 

and Networking, vol. 2008, pp. 49 803^19 803, 2008. 
[12] A. Behbahani and A. Eltawil, "On channel estimation and capacity for amplify and forward relay networks," in Global 

Telecommunications Conference, 2008. IEEE GLOBECOM 2008. IEEE. IEEE, 2008, pp. 1-5. 
[13] H. Wang and P. Chang, "On verifying the first-order Markovian assumption for a Rayleighfading channel model," Vehicular 

Technology, IEEE Transactions on, vol. 45, no. 2, pp. 353-357, 1996. 
[14] C. Komninakis, C. Fragouli, A. Sayed, and R. Wesel, "Multi-input multi-output fading channel tracking and equaliza- 

tionusing Kalman estimation," IEEE Transactions on Signal Processing, vol. 50, no. 5, pp. 1065-1076, 2002. 
[15] C. Andrieu, A. Doucet, and R. Holenstein, "Particle Markov chain Monte Carlo methods," Journal of the Royal Statistical 

Society Series B, vol. 72, no. 2, pp. 1-33, 2010. 
[16] J. Laneman and G. Womell, "Exploiting distributed spatial diversity in wireless networks," in Proc. Allerton Conference 

on Communications, Control, and Computing, 2000. 
[17] W. Jakes and D. Cox, Microwave Mobile Communications. Wiley-IEEE Press, 1994. 

[18] C. Patel and G. Stuber, "Channel estimation for amplify and forward relay based cooperation diversity systems," IEEE 

Transactions on Wireless Communications, vol. 6, no. 6, pp. 2348-2356, 2007. 
[19] Y. Bar-Shalom, X. Li, and T. Kirubarajan, Estimation with applications to tracking and navigation. Wiley-Interscience, 

2001. 

[20] J. Durbin and S. Koopman, Time series analysis by state space methods. Oxford Univ Pr, 2001. 

[21] R. Silva, P. Giordani, R. Kohn, and M. Pitt, "Particle filtering within adaptive Metropolis Hastings sampling," Arxiv preprint 
arXiv:091 1.0230, 2009. 

[22] C. Carter and R. Kohn, "On Gibbs sampling for state space models," Biometrika, vol. 81, no. 3, p. 541, 1994. 
[23] A. Doucet, N. de Freitas, K. Murphy, and S. Russell, "Rao-Blackwellised particle filtering for dynamic Bayesian networks," 
pp. 176-183, 2000. 

[24] C. Andrieu and G. Roberts, "The pseudo-marginal approach for efficient Monte Carlo computations," Ann. Statist, vol. 37, 
no. 2, pp. 697-725, 2009. 

[25] A. Doucet and A. Johansen, "A tutorial on particle filtering and smoothing: Fifteen years later," Handbook of Nonlinear 
Filtering, 2009. 

[26] P. Del Moral, Feynman-Kac formulae: genealogical and interacting particle systems with applications. Springer Verlag, 
2004. 

[27] A. Doucet, S. Godsill, and C. Andrieu, "On sequential Monte Carlo sampling methods for Bayesian filtering," Statistics 
and computing, vol. 10, no. 3, pp. 197-208, 2000. 



November 25, 2010 



DRAFT 



23 



[28] G. Roberts and J. Rosenthal, "Examples of adaptive MCMC," Journal of Computational and Graphical Statistics, vol. 18, 
no. 2, pp. 349-367, 2009. 

[29] Y. Atchade and J. Rosenthal, "On adaptive markov chain monte carlo algorithms," Bernoulli, vol. 11, no. 5, pp. 815-828, 
2005. 

[30] H. Haario, E. Saksman, and J. Tamminen, "An adaptive Metropolis algorithm," Bernoulli, vol. 7, no. 2, pp. 223-242, 2001. 
[31] C. Andrieu and E. Moulines, "On the ergodicity properties of some adaptive MCMC algorithms," Annals of Applied 

Probability, vol. 16, no. 3, pp. 1462-1505, 2006. 
[32] C. Andrieu and Y. Atchade, "On the efficiency of adaptive MCMC algorithms," Proceedings of the 1st international 

conference on Performance evaluation methodolgies and tools, p. 43, 2006. 
[33] G. Roberts and J. Rosenthal, "Optimal scaling for various Metropolis-Hastings algorithms," Statistical Science, pp. 35 1-367, 

2001. 

[34] P. Tichavsky, C. Muravchik, and A. Nehorai, "Posterior Cramer-Rao bounds for discrete-time nonlinear filtering," IEEE 
Transactions on Signal Processing, vol. 46, no. 5, pp. 1386-1396, 1998. 




Relay 1 



W 



<X)_ *@ J Relay 2 



U l (L) 



Relay L 




Fig. 1. System model of the relay network with L relays and a dual hop 
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Algorithm 1 Deterministic Scan Gibb Sampler. 
1: Set initial state [a, (3, gi-.r, hi:T, w 1:t](0) deterministically or by sampling the priors; 

Repeat j- 1 to J 

2: Sample [a, f3](j + 1) ~ p (a, (3\[g 1:T , h 1:T , w 1:T ](j), y^) 

Repeat k = 1 to K 

3: Sample [G (fc _i )r+ i :fcr ](j+l) ~ p (g(fc_i) r+ i :fcr |[a, (3, gi :(fe _i) r ](j + 1), [gkr+i-.Tr, h 1:T , wi :T ](j), yfft 
Repeat k = 1 to K 

4: Sample [h. {k _ 1)T+1 . kT ](j +1) ~ p (h (fc _ 1)r+1:(tr |[a, (3, h 1:(fe _ 1)r , gi :T ](j + 1), [hjtr+l:Tr, Wi :T ](j), yf.^ 
Repeat = 1 to K 

5: Sample [w (fc _ 1)r+1:fer ](j+l) ~ p (w (fc _ 1)T+1 . fcr | [a, (3, w 1:(fc _ 1)r , hi :T , gid(j + 1), [wjfc T+ i : Tr]0')>?/i:T 




Fig. 2. Graphical model of the relay system with a dual hop 
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Algorithm 2 Generic PMCMC algorithm 



Initialise Markov chain state: 

1: Initialise [a, /3, hi : T, gi:T> Wu] (1) by sampling each value from the corresponding priors, see 
section II-A1. 
Begin PMCMC iterations: 
2: for j = 1, . . . , J do 
Sample static parameters for a, [3 from adaptive mixture proposal, see Algorithm 3 
3: Sample [a,0\* (j + 1) ~ q([a,0\ (j); [a, (3} (j + 1)). 

Construct and sample path space realisations for hx : T, gi-.T, Wi : t from Rao-Blackwellised SIR filter 

proposal, see Algorithm 4 
4: Run an SMC algorithm with TV particles to obtain: 

N 

p(hi :T ,gi :T ,wi :T |yi : T, [a,0\* (j + 1)) = [Hi :T ] (j + l,^[h 1:T ,g 1:T ,w 1:T ](j+i,i) (hi.-r.gi:T,wi : r) 

i=i 

T / iV \ 

'(y 1:T | [a,/3]* (j + 1)) = II huEfe] C7 + M) 

t=i \ i=i j 



t=i \ i=i J 

(21) 

5: Sample proposed path space [h 1:T , gi :T , Wi :T ]* (j + 1) ~ p (h 1:T , g 1:T , wi :T |yi :T , [at,(3]* {j + 1)) 
Combine proposals and calculate PMCMCM acceptance probability 

6: Combine both sampled proposals to obtain [a, (3, h\ : T, gi-T, w^]* (j + 1) 
7: Accept the proposed new Markov chain state comprised of [a, (3, h 1: ^, gi : *r, w 1; f]* (j + 1) with 
acceptance probability given by 

A ([a, (3, hi : T, gi:T, Wi :T ] (j); [a, /3, hi :T , gi :T , Wi :T ]* (j + 1)) 

L p(y 1:T | [a,/3]* (j + l))p([a,/3]* (j + 1)) q ([a,/3]* (j + 1); [a,/3] (j))^ (22) 



p(y 1:T | [a,/3] (j))p([«,/3] (j)) g([a,/3] (j); [a,/3f (j + 1)) 
where p(yi : T| [<*>/3] (j)) is obtained from the previous iteration of the PMCMC algorithm. 
8: Sample a realisation u\ of random variable U\ ~ £7[0, 1] 
9: if ui < A([a,/3,hi :T ,gi : T,wi : T] (j) [a, f3,h 1:T ,g hT ,wi: T ]* (j + 1)) then 
10: Accept proposed state [a, /3, h 1:T , g 1:T , wu] (j + 1) = [a, /3, hi :T , gi :T , wi :T ]* (j + 1) for 

iteration j + 1 
ll: else 

12: [a, /3, hi :T , gi:T, Wu] (j + 1) = [a, /3, h 1:T , gi;T, Wi; T ] (i) 

13: end if 
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Algorithm 3 Adaptive MCMC for static parameters a, /3 
1: Sample a realisation u\ of random variable U\ ~ U[0, 1] 

Sample from the adaptive mixture proposal (13) 

2: if «i > w\ then 

Sample [o, f3] (j + 1) from the adaptive component of the mixture proposal of (13) 
3: Estimate Sy, the empirical covariance of a,/3, using samples {[a, 0\(i)}i=i-.j ■ 
4: Sample proposal [a, f3\ (j + 1) ~ N (a, (3; [a, [3] (j), ^f 1 ^) ; 
5: else 

Sample [ol, f3] (j + 1) from the non-adaptive component of the mixture proposal of (13) 
6: Sample proposal [a, (3} (j + 1) ~ N (a, (3; [a, f3] (j), ^-I d ,dj 
7: end if 




NumbGr of particles 



Fig. 3. AdPMCMC: Acceptance probability for different number of particles 




20 40 60 80 100 20 40 60 !0 100 20 40 60 80 100 



Fig. 4. AdPMCMC: MMSE for h with 95% posterior CI for low, medium and high SNRs, T = 100, N = 100. Note, true 
latent process is presented in green, the MMSE estimate in blue and posterior confidence intervals in dashed red line. 
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Algorithm 4 Construction of proposal p (hi : T, gv.T, wi-.tWi-.t, (j + 1)), given [oc,/3] (j + 1) 

Initialisation of Rao-Blackwellised SIR filter at iteration {j + 1) of the Markov chain 

1: SIR particle filter for hi : y,wi : r: initialise TV particles {[hi, wi] (j + l,i)} i=17V via sampling from 

the priors in Section II. 

2: Rao-Blackwellised Kalman filter for gi-r- initialise the mean and covariance of gi for each particle 

denoted by S x ] (j + 1, i)} i=1:N 
3: for t = 2,...,Tdo 

Perform mutation of the N particles at frame time t — 1 to obtain new particles at t via state evolution. 
4: Sample the i-th particle [h t ] (j + from particle filter proposal 



~ CM Ma] {j + 1) [h t _i] (j + 1, i), y 1 - ([a] (j + 1)) 2 J , according to state equation (5) . 
5: Sample the i-th particle [w t ] (j + from the prior, see Section EL 
Perform Kalman filter evolution for sufficient statistics of g n . 

6: For each particle, to obtain {[/%, E n ] (j + 1, i)}i=i-jv a PP^ recursions in algorithm 3. 
Incremental SIR importance sampling weight correction. 
7: Evaluate the unnormalised importance sampling weights, E t (j + 1, i), for the N particles, with 
the i-th weight given by 

(j + 1, t) oc [St_i] (j + 1, i) [Ct-i] (j + 1, *) 

(23) 

« [ s *-i] (i + !^My*l [ht,gt, w t ,a,/3] (j + 
8: Normalise the importance sampling weights [^] (j + = 
Evaluate the importance estimate and resample adaptively. 
9: Calculate the Effective Sample size, ESS = „ N r , 1 , . .., 

X/ 4=I Kt-iJU+l>*) 

10: If the Effective Sample size is less than 80% then resample the particles at time t using stratified 
resampling based on the empirical distribution constructed from the importance weights to obtain 
new particles with equal weight. 

11: end for 

12: Evaluate marginal likelihood p(y UT \ [a,f3] (j + 1)) = l\J=i (j? Eti fe] + 1 ,' 
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Algorithm 5 Kalman filter recursion for [g n ] (J, i) particle i 
l: sample w ~ CM (0, 1) 

2: Predict; 

[/*n|n-l] (j>») = [)*n-l|n-l] (j, *) + \/ 1 ~ C/)) 2 W 

[£»|n-l] = [/i n -l|n-l] CM) [S n _i|„_i] [/X n _!| n _i] (j',i) T + \Jl - ([/3] (j)) 2 

3: Update 

y = Yn - / 0„ [h„] (j, i), n) [/i n | n _i] (j, i) 

S = / (s n [h n ] (j, i), n) [/^ n | n _l] (j, i) [S n |n-l] (j, 0/ ( S « [ h n] (J) *)> n ) [Mnln-l] (j, «) + <U 

K = [E n | re _i] (s n [h n ] (j,i),n) [/^ n | re _i] (j,i) T S^ 1 

[/*n|n] = (j>*) + K y 

[S„| n ] Cj'jO = (l-K/(s„[h n ] (j,i),n) [/x n | n _i] (?",*)) [£ n | n _i] 




Fig. 5. AdPMCMC: MMSE for g with 95% posterior CI for low, medium and high SNRs, T = 100, N = 100. Note, true 
latent process is presented in green, the MMSE estimate in blue and posterior confidence intervals in dashed red line. 
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D 10 20 30 40 50 60 70 80 90 100 



Fig. 6. Gibbs: MMSE for h (top panel) and g (lower panel) with 95% posterior CI for medium SNR, T = 100. Note, true 
latent process is presented in green, the MMSE estimate in blue and posterior confidence intervals in dashed red line. 




Fig. 7. AdPMCMC: sample path for a for low, medium and high SNRs, T = 100, N = 100 




Fig. 8. AdPMCMC: sample path for /3 for low, medium and high SNRs, T = 100, N = 100 
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(a) 



(b) 



Fig. 9. Distribution of estimated MSE of h (left panel) and h (right panel) vs. SNR over 500 frames, T=100, N=100. The 
solid line represents the average MSE vs SNR. 



(a) (b) 
Fig. 10. boxplot MMSE for a (left panel) and /3 (left panel) vs. SNR over 40 frames, T=100, N=100. 




(a) BCRLB for gi :T (b) BCRLB for h 1:T 

Fig. 11. BCRLB estimated after marginalizing out uncertainty from model parameters a, (3 and averaged over 500 data 
realizations. Each PMCMC Markov chain was simulated for J = 25, 000 iterations. 
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